
[ENH] V1 → V2 API Migration - studies #1610

Open
rohansen856 wants to merge 82 commits into openml:main from rohansen856:studies-migration

Conversation

@rohansen856
Contributor


Stacked PR, depends on #1576

This PR adds the Studies v2 API migration.

A question:
Due to the pre-commit hook I could not put 6 arguments in a function, so I had to work around that with this instead:
openml_api\resources\studies.py (lines 10-15)

        limit = kwargs.get("limit")
        offset = kwargs.get("offset")
        status = kwargs.get("status")
        main_entity_type = kwargs.get("main_entity_type")
        uploader = kwargs.get("uploader")
        benchmark_suite = kwargs.get("benchmark_suite")

I would like to confirm whether this approach is correct. Raising a draft PR for now.

@codecov-commenter

codecov-commenter commented Jan 8, 2026

Codecov Report

❌ Patch coverage is 50.22693% with 329 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.06%. Comparing base (d421b9e) to head (18dc72a).

Files with missing lines Patch % Lines
openml/_api/clients/http.py 24.46% 142 Missing ⚠️
openml/_api/resources/base/versions.py 24.71% 67 Missing ⚠️
openml/_api/resources/study.py 25.00% 33 Missing ⚠️
openml/_api/runtime/core.py 55.38% 29 Missing ⚠️
openml/_api/resources/base/fallback.py 26.31% 28 Missing ⚠️
openml/testing.py 48.71% 20 Missing ⚠️
openml/_api/config.py 95.45% 3 Missing ⚠️
openml/_api/resources/base/base.py 76.92% 3 Missing ⚠️
openml/study/functions.py 50.00% 2 Missing ⚠️
openml/_api/__init__.py 88.88% 1 Missing ⚠️
... and 1 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1610      +/-   ##
==========================================
+ Coverage   52.04%   52.06%   +0.02%     
==========================================
  Files          36       58      +22     
  Lines        4333     4965     +632     
==========================================
+ Hits         2255     2585     +330     
- Misses       2078     2380     +302     

☔ View full report in Codecov by Sentry.

@geetu040 geetu040 mentioned this pull request Jan 9, 2026
@rohansen856
Contributor Author

Implementing noqa instead of the kwargs workaround, following the example from openml\testing.py:

    def _check_fold_timing_evaluations(  # noqa: PLR0913
        self,
        fold_evaluations: dict[str, dict[int, dict[int, float]]],
        num_repeats: int,
        num_folds: int,
        *,
        max_time_allowed: float = 60000.0,
        task_type: TaskType = TaskType.SUPERVISED_CLASSIFICATION,
        check_scores: bool = True,
    ) -> None:

Final function signature:

    def list(  # noqa: PLR0913
        self,
        limit: int | None = None,
        offset: int | None = None,
        status: str | None = None,
        main_entity_type: str | None = None,
        uploader: list[int] | None = None,
        benchmark_suite: int | None = None,
    ) -> Any:

Signed-off-by: rohansen856 <rohansen856@gmail.com>
@rohansen856 rohansen856 marked this pull request as ready for review January 13, 2026 07:21
Collaborator

@geetu040 geetu040 left a comment


Good work. Just use the listing as suggested in #1575 (comment), which is already similar to what you have done.

@rohansen856
Contributor Author

@geetu040 I reviewed the specific changes needed and have a slight doubt about the pandas implementation.
As I understand it, I need to use a pandas DataFrame instead of Any in openml\_api\resources\base.py, like this:

class StudiesAPI(ResourceAPI, ABC):
    @abstractmethod
    def list(  # noqa: PLR0913
        self,
        limit: int | None = None,
        offset: int | None = None,
        status: str | None = None,
        main_entity_type: str | None = None,
        uploader: list[int] | None = None,
        benchmark_suite: int | None = None,
    ) -> pd.DataFrame: ...

and similarly I have to change the return object in openml\_api\resources\studies.py from this: return response.text
to this:

        xml_string = response.text

        # Parse XML and convert to DataFrame
        study_dict = xmltodict.parse(xml_string, force_list=("oml:study",))

        # Minimalistic check if the XML is useful
        assert isinstance(study_dict["oml:study_list"]["oml:study"], list), type(
            study_dict["oml:study_list"],
        )
        assert (
            study_dict["oml:study_list"]["@xmlns:oml"] == "http://openml.org/openml"
        ), study_dict["oml:study_list"]["@xmlns:oml"]

        studies = {}
        for study_ in study_dict["oml:study_list"]["oml:study"]:
            # maps from xml name to a tuple of (dict name, casting fn)
            expected_fields = {
                "oml:id": ("id", int),
                "oml:alias": ("alias", str),
                "oml:main_entity_type": ("main_entity_type", str),
                "oml:benchmark_suite": ("benchmark_suite", int),
                "oml:name": ("name", str),
                "oml:status": ("status", str),
                "oml:creation_date": ("creation_date", str),
                "oml:creator": ("creator", int),
            }
            study_id = int(study_["oml:id"])
            current_study = {}
            for oml_field_name, (real_field_name, cast_fn) in expected_fields.items():
                if oml_field_name in study_:
                    current_study[real_field_name] = cast_fn(study_[oml_field_name])
            current_study["id"] = int(current_study["id"])
            studies[study_id] = current_study

        return pd.DataFrame.from_dict(studies, orient="index")

A total of 3 files would be affected: openml\_api\resources\base.py, openml\_api\resources\studies.py and openml\study\functions.py

Can you please confirm my approach? After that I will update the PR.

@geetu040
Collaborator

@rohansen856 yes, sounds right.

@rohansen856
Contributor Author

Updated! Ready for review.

Collaborator

@geetu040 geetu040 left a comment


Almost fine, just completely remove _list_studies as well and replace _list_studies with api_context.backend.studies.list as the parameter for partial in list_studies. Hope I did not confuse you, just search for the exact method names in the code. Let me know if I am not clear enough.

@rohansen856
Contributor Author

rohansen856 commented Jan 16, 2026

Almost fine, just completely remove _list_studies as well and replace _list_studies with api_context.backend.studies.list as the parameter for partial in list_studies. Hope I did not confuse you, just search for the exact method names in the code. Let me know if I am not clear enough.

Oh definitely! I probably missed that in openml\study\functions.py, but I'm pushing the change with the next commit.
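For reference, a minimal sketch of what that change could look like in openml\study\functions.py (illustrative only: the import path for api_context, the keyword arguments, and the simplified signature are assumptions, not the final code):

    from functools import partial

    # Sketch only: the real import path for api_context may differ.
    from openml._api import api_context


    def list_studies(status=None, uploader=None, benchmark_suite=None, **paging):
        # The old module-level _list_studies helper is removed; the backend
        # resource method is handed to partial instead, as suggested above.
        listing_call = partial(
            api_context.backend.studies.list,
            status=status,
            uploader=uploader,
            benchmark_suite=benchmark_suite,
        )
        return listing_call(**paging)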

…list

Signed-off-by: rohansen856 <rohansen856@gmail.com>


class StudyV1API(ResourceV1API, StudyAPI):
    def list(  # noqa: PLR0913
Contributor

@EmanAbdelhaleem EmanAbdelhaleem Feb 4, 2026


I think we can split this into 3 functions for more readability:

  • list()
  • _build_url()
  • _parse_list_xml()

check #1606 for reference

Contributor Author


Understood! Will separate the long list function into the said 3 functions with proper docstrings. Applying with the next commit.
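A rough sketch of that split (illustrative only: the base classes are omitted, self._http and the v1 path-segment URL scheme are assumptions, and the field mapping mirrors the parsing code quoted earlier):

    from __future__ import annotations

    from typing import Any

    import pandas as pd
    import xmltodict


    class StudyV1API:  # bases (ResourceV1API, StudyAPI) omitted in this sketch
        def list(  # noqa: PLR0913
            self,
            limit: int | None = None,
            offset: int | None = None,
            status: str | None = None,
            main_entity_type: str | None = None,
            uploader: list[int] | None = None,
            benchmark_suite: int | None = None,
        ) -> pd.DataFrame:
            """List studies, delegating URL building and XML parsing to helpers."""
            url = self._build_url(
                limit=limit,
                offset=offset,
                status=status,
                main_entity_type=main_entity_type,
                uploader=uploader,
                benchmark_suite=benchmark_suite,
            )
            response = self._http.get(url)  # assumed HTTP-client attribute
            return self._parse_list_xml(response.text)

        def _build_url(self, **filters: Any) -> str:
            """Append only the filters that are not None as v1 path segments."""
            parts = ["study/list"]
            for name, value in filters.items():
                if value is None:
                    continue
                if isinstance(value, list):
                    value = ",".join(str(v) for v in value)
                parts.append(f"{name}/{value}")
            return "/".join(parts)

        def _parse_list_xml(self, xml_string: str) -> pd.DataFrame:
            """Parse the oml:study_list XML into a DataFrame keyed by study id."""
            study_dict = xmltodict.parse(xml_string, force_list=("oml:study",))
            # Maps from xml name to a tuple of (dict name, casting fn).
            expected_fields = {
                "oml:id": ("id", int),
                "oml:alias": ("alias", str),
                "oml:main_entity_type": ("main_entity_type", str),
                "oml:benchmark_suite": ("benchmark_suite", int),
                "oml:name": ("name", str),
                "oml:status": ("status", str),
                "oml:creation_date": ("creation_date", str),
                "oml:creator": ("creator", int),
            }
            studies = {}
            for study_ in study_dict["oml:study_list"]["oml:study"]:
                parsed = {
                    name: cast(study_[xml_name])
                    for xml_name, (name, cast) in expected_fields.items()
                    if xml_name in study_
                }
                studies[int(study_["oml:id"])] = parsed
            return pd.DataFrame.from_dict(studies, orient="index")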

@@ -0,0 +1,94 @@
# License: BSD 3-Clause
from __future__ import annotations
Contributor


I think it would be better to change the file name to "test_study" for consistency

Contributor Author


Agreed! Applying with the next commit.

Contributor Author


Similarly in this case, the tests\test_study folder should be renamed to tests\test_studies.
cc @geetu040

Collaborator


Makes sense, but let's not do it here; that would make the file hard to review with the visible changes.

Contributor Author


Got it!

assert all(studies_df["status"] == "active")

@pytest.mark.uses_test_server()
def test_list_pagination(self):
Contributor


I don't think we need to test pagination here. These tests should only be specific to the API. It's better to leave this test in test_study_functions, if it's there.

Contributor Author


There is actually no pagination test in test_study_functions, so implementing it here should be fine. Let me know if you think we should still remove it.
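For context, roughly what the pagination test checks (a sketch of the test method as it sits in this test class; it assumes list() returns a DataFrame indexed by study id and that the test server has at least ten studies):

    @pytest.mark.uses_test_server()
    def test_list_pagination(self):
        # Assumption: limit/offset behave as standard paging parameters.
        first_page = self.api.list(limit=5, offset=0)
        second_page = self.api.list(limit=5, offset=5)

        assert len(first_page) == 5
        assert len(second_page) == 5
        # Consecutive pages should not return the same studies.
        assert set(first_page.index).isdisjoint(set(second_page.index))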


def setUp(self) -> None:
    super().setUp()
    self.api = StudyV2API(self.http_client)
Contributor

@EmanAbdelhaleem EmanAbdelhaleem Feb 4, 2026


This is v2, you need to use

self.v2_client = self._get_http_client(
            server="http://localhost:8001/",
            base_url="",
            api_key="",
            timeout_seconds=self.timeout_seconds,
            retries=self.retries,
            retry_policy=self.retry_policy,
            cache=self.cache,
        )

and change the server to your local v2 server

Contributor Author


Understood!
Replacing this:

self.api = StudyV2API(self.http_client)

with this:

self.v2_client = self._get_http_client(
            server="http://localhost:8001/",
            base_url="",
            api_key="",
            timeout=self.timeout,
            retries=self.retries,
            retry_policy=self.retry_policy,
            cache=self.cache,
)
self.api = StudyV2API(self.v2_client)

self.v2_api = StudyV2API(self.http_client)

@pytest.mark.uses_test_server()
def test_v1_v2_compatibility(self):
Contributor

@EmanAbdelhaleem EmanAbdelhaleem Feb 4, 2026


I think this should test that the output matches and follow the naming style mentioned here: #1575 (comment)

check #1603 for reference
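For illustration, a compatibility test along those lines could look roughly like this (a sketch only; it assumes both backends' list() returns a DataFrame and that pandas is imported as pd in the test module):

    @pytest.mark.uses_test_server()
    def test_list_matches(self):
        # Assumption: both backends return a DataFrame indexed by study id.
        v1_df = self.v1_api.list(limit=10)
        v2_df = self.v2_api.list(limit=10)
        pd.testing.assert_frame_equal(
            v1_df.sort_index(axis=1),
            v2_df.sort_index(axis=1),
        )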

# Both should have delete, tag, untag from base
for method in ["delete", "tag", "untag", "publish"]:
    assert hasattr(self.v1_api, method)
    assert hasattr(self.v2_api, method)
Contributor


I think you need to add Fallback tests as mentioned here: #1575 (comment)

check #1603 for reference

Contributor Author


Understood! I will implement FallbackProxy and a test_list_fallback function that checks that the FallbackProxy automatically falls back from V2 to V1 when V2 raises "not supported". Also, in the case of test_list_matches, I think it should be marked with @pytest.mark.skip(reason="V2 list not yet implemented"), as it currently throws OpenMLNotSupportedError.
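A minimal sketch of that fallback test, written with mocks so it does not depend on how the proxy is wired to the HTTP clients (the FallbackProxy constructor order, the exception's import path, and its message-only constructor are assumptions here):

    from unittest.mock import MagicMock

    import pandas as pd

    # Assumed import paths; the real locations come from this PR / #1576.
    from openml._api.resources.base.fallback import FallbackProxy
    from openml.exceptions import OpenMLNotSupportedError


    def test_list_fallback():
        """FallbackProxy should retry on V1 when V2 raises OpenMLNotSupportedError."""
        expected = pd.DataFrame({"name": ["study-a"]}, index=[1])

        v2_api = MagicMock()
        v2_api.list.side_effect = OpenMLNotSupportedError("list is not supported in v2")
        v1_api = MagicMock()
        v1_api.list.return_value = expected

        # Assumption: the proxy takes the preferred backend first, then the fallback.
        proxy = FallbackProxy(v2_api, v1_api)
        result = proxy.list(limit=1)

        v2_api.list.assert_called_once()
        v1_api.list.assert_called_once()
        pd.testing.assert_frame_equal(result, expected)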

Signed-off-by: rohansen856 <rohansen856@gmail.com>
@rohansen856 rohansen856 marked this pull request as ready for review February 5, 2026 05:40